Moves glob/wildcard matching into Fact. by Stringy · Pull Request #323 · stackrox/fact

Stringy · 2026-02-23T14:15:23Z

Description

Host scanning now uses the configured globs so that it only collects inodes for the files matching those globs.

The prefix map is populated with the longest literal prefix of each glob, e.g.:
/etc/**/*.conf -> /etc/
/home/user/.ssh/id_{rsa,dsa} -> /home/user/.ssh/id_
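
The prefix rule can be sketched in a few lines. This is an illustrative Python sketch, not the Rust code from this PR: the function name `longest_literal_prefix` and the set of metacharacters (`*`, `?`, `[`, `{`) are assumptions, and escape sequences are ignored.

```python
# Hypothetical helper: the name and the metacharacter set are assumptions
# made for illustration, not taken from the PR.
GLOB_META = set('*?[{')


def longest_literal_prefix(pattern: str) -> str:
    """Return everything before the first glob metacharacter."""
    for i, ch in enumerate(pattern):
        if ch in GLOB_META:
            return pattern[:i]
    # No metacharacters: the whole pattern is literal.
    return pattern


examples = ['/etc/**/*.conf', '/home/user/.ssh/id_{rsa,dsa}']
prefixes = [longest_literal_prefix(p) for p in examples]
print(prefixes)
```

Applied to the two examples from the description, this yields the same prefixes listed there.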

The kernel still captures events by inode first and then by prefix match (this behavior is unchanged); userspace then does a glob match on the path and host_path.
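
The resulting two-stage filter can be pictured as follows. This is a Python sketch of the flow described here, not the actual kernel or userspace code: `Event`, `kernel_filter`, and `userspace_filter` are hypothetical names, the inode set and prefix list stand in for the BPF maps, and `fnmatchcase` stands in for the real glob engine (unlike a path-aware matcher, its `*` also matches `/`). It assumes, as the kernel change in this PR suggests, that an event matched by prefix only carries no inode, and that this is what tells userspace to run the glob match. The host_path check is left out for brevity.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional

MONITORED_INODES = {42}   # stand-in for the kernel inode map
PREFIXES = ['/etc/']      # stand-in for the kernel prefix map
GLOBS = ['/etc/*.conf']   # patterns configured by the user


@dataclass
class Event:
    path: str
    inode: Optional[int]  # None means "matched by prefix only"


def kernel_filter(path: str, inode: int) -> Optional[Event]:
    """Inode check first, then prefix check; otherwise drop the event."""
    if inode in MONITORED_INODES:
        return Event(path, inode)
    if any(path.startswith(prefix) for prefix in PREFIXES):
        return Event(path, None)
    return None


def userspace_filter(event: Event) -> bool:
    """Glob-match only the events the kernel matched by prefix."""
    if event.inode is not None:
        return True
    return any(fnmatchcase(event.path, glob) for glob in GLOBS)


def emitted(path: str, inode: int) -> bool:
    event = kernel_filter(path, inode)
    return event is not None and userspace_filter(event)
```

In this model a file that is already tracked by inode is reported without a glob check, a new file under `/etc/` is reported only if it matches a configured glob, and anything outside the prefixes never leaves the kernel.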

Somewhat relies on a chain of PRs in the main repo (in merge order):

stackrox/stackrox#19057
stackrox/stackrox#19063
stackrox/stackrox#19089

Checklist

Investigated and inspected CI test results
Updated documentation accordingly

Automated testing

Added unit tests
Added integration tests
Added regression tests

If any of these don't apply, please comment below.

Testing Performed

TODO(replace-me)
Use this space to explain how you tested your PR, or, if you didn't test it, why you did not do so. (Valid reasons include "CI is sufficient" or "No testable changes")
In addition to reviewing your code, reviewers must also review your testing instructions, and make sure they are sufficient.

For more details, ref the Confluence page about this section.

Molter73

Awesome! Thanks for tackling this!!

fact-ebpf/src/lib.rs

Molter73 · 2026-02-23T14:30:57Z

fact/src/bpf/mod.rs

        let mut new_paths = Vec::with_capacity(paths_config.len());
+        let mut builder = GlobSetBuilder::new();
        for p in paths_config.iter() {
+            builder.add(Glob::new(&p.to_string_lossy())?);


We probably want to hard fail if a configured path is wrong. If we change the string at this point, we might not match the paths configured by the user, and we will not report things in there.

fact/src/bpf/mod.rs

fact/src/host_scanner.rs

Molter73

Changes look good. Can we add at least one integration test with globs? Just to make sure it is working; we can always expand later.

Molter73 · 2026-02-25T11:50:54Z

/retest

Molter73 · 2026-02-25T11:57:14Z

fact-ebpf/src/bpf/main.c

  inode_key_t inode_key = inode_to_key(file->f_inode);
  const inode_value_t* inode = inode_get(&inode_key);
+  inode_key_t* inode_to_submit = &inode_key;
  switch (inode_is_monitored(inode)) {
    case NOT_MONITORED:
      if (!is_monitored(path)) {
        goto ignored;
      }
+      // Matched by path prefix only, not by inode.
+      // Set inode to NULL so userspace knows to do glob matching.
+      inode_to_submit = NULL;
      break;
    case MONITORED:
      break;
  }

-  submit_event(&m->file_open, event_type, path->path, &inode_key, true);
+  submit_event(&m->file_open, event_type, path->path, inode_to_submit, true);


I didn't want to use a map because the inode_key_t type is small, but we might want to have a per-CPU array map and use a pointer into it to simplify the code even further. It would also reduce the amount of stack used by every program, which might be helpful in the future.

Anyway, I don't want to block the PR on this; we can do it in a follow-up.

Molter73 · 2026-02-25T12:02:07Z

fact/src/bpf/mod.rs

+        let mut builder = GlobSetBuilder::new();
        for p in paths_config.iter() {
+            builder.add(
+                Glob::new(&p.to_string_lossy())


I thought I had mentioned this: we probably don't want to use to_string_lossy() here. If the configured path contains invalid UTF-8, the lossy conversion changes the string, so the pattern will not match the actual path and we will silently drop events that we should emit. This will be very hard to debug, especially for a user who might not be aware of it. Instead we probably want to hard fail, either falling back to the previous configuration or stopping altogether.
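
The failure mode is easy to reproduce. The sketch below is a Python analogue, not the Rust code under review: decoding with `errors='replace'` substitutes U+FFFD for invalid bytes, much like `to_string_lossy()` does, while strict decoding fails loudly. The sample path and the `load_pattern` helper are hypothetical.

```python
# Hypothetical configured path: the byte 0xE9 followed by '/' is not valid UTF-8.
raw = b'/data/caf\xe9/*.conf'

# Lossy decoding silently swaps the bad byte for U+FFFD.
lossy = raw.decode('utf-8', errors='replace')

# The pattern no longer round-trips to the bytes the user configured,
# so it cannot match the real on-disk path.
round_trips = lossy.encode('utf-8') == raw


def load_pattern(raw_path: bytes) -> str:
    """Hard-fail on invalid UTF-8 instead of silently altering the pattern."""
    try:
        return raw_path.decode('utf-8')
    except UnicodeDecodeError as err:
        raise ValueError(
            f'configured path is not valid UTF-8: {raw_path!r}'
        ) from err
```

Rejecting the path at load time turns a silent loss of events into an error the user sees when the configuration is applied.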

Molter73 · 2026-02-25T12:05:04Z

tests/conftest.py

    cwd = os.getcwd()
    config = {
-        'paths': [monitored_dir, '/mounted', '/container-dir'],
+        'paths': [f'{monitored_dir}/**/*', '/mounted/**/*', '/container-dir/**/*'],


It seems a bit annoying that we now have to manually specify /**/* at the end of every configured directory. I might look into changing this behavior in the future; maybe fact can add it automatically when the configured path has no glob expressions in it.

I'm also curious why this matches files directly under /mounted, for instance, when the glob expression explicitly says there should be two / after the path.
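
The automatic suffix floated in this comment could look roughly like the following. This is a hypothetical Python sketch of the idea only; nothing like it exists in the PR, and the helper name `with_default_glob` and the metacharacter set are assumptions.

```python
GLOB_META = set('*?[{')  # assumed set of glob metacharacters


def with_default_glob(path: str) -> str:
    """Append '/**/*' when the configured path contains no glob expression."""
    if any(ch in GLOB_META for ch in path):
        # The user already wrote a glob: leave it untouched.
        return path
    return path.rstrip('/') + '/**/*'


configured = ['/mounted', '/container-dir/', '/etc/*.conf']
expanded = [with_default_glob(p) for p in configured]
print(expanded)
```

One caveat with this approach: a configured path that names a single file would also get the suffix and would then stop matching the file itself, so a real implementation would need to tell files and directories apart.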

> why this matches files directly under /mounted

`**` will match zero or more path segments, so matching files in /mounted is expected behaviour.
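
Python's standard `glob` module with `recursive=True` follows the same "zero or more directories" rule, so it can serve as a quick illustration. This is an analogue run against a temporary directory, not the matcher used by fact; the file names are made up.

```python
import glob
import os
import tempfile


def touch(path: str) -> None:
    """Create an empty file, including any missing parent directories."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, 'w'):
        pass


with tempfile.TemporaryDirectory() as root:
    touch(os.path.join(root, 'top.txt'))              # directly under the root
    touch(os.path.join(root, 'a', 'b', 'deep.txt'))   # two levels down

    # Equivalent to '<root>/**/*'; escape() guards against odd characters in root.
    pattern = os.path.join(glob.escape(root), '**', '*')
    matches = sorted(
        os.path.relpath(p, root)
        for p in glob.glob(pattern, recursive=True)
        if os.path.isfile(p)
    )

print(matches)
```

Both files are found: the one two levels down and the one directly under the root, where `**` matched zero directories.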

Molter73 · 2026-02-25T12:07:35Z

tests/test_wildcard.py

+    config, config_file = fact_config
+    config['paths'] = [
+        f'{monitored_dir}/**/*.txt',
+        f'{monitored_dir}/**/test-*.log',


Can we add a case that is something like f'{monitored_dir}/*.cfg'?

This should validate we don't recursively check directories when ** is not used.
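
The outcomes such a test would pin down can be written out as a small table. The sketch below uses `pathlib`'s matcher, whose `*` does not cross `/`, as a stand-in; whether fact's matcher behaves the same way depends on how it is configured, which is exactly what the requested test would verify. The `monitored_dir` value is a hypothetical placeholder for the test fixture.

```python
from pathlib import PurePosixPath

monitored_dir = '/monitored'  # hypothetical stand-in for the test fixture
pattern = f'{monitored_dir}/*.cfg'

# Path -> whether an event for it should be reported.
cases = {
    f'{monitored_dir}/app.cfg': True,          # directly in the directory
    f'{monitored_dir}/nested/app.cfg': False,  # one level down: not recursive
    f'{monitored_dir}/app.txt': False,         # wrong extension
}

results = {path: PurePosixPath(path).match(pattern) for path in cases}
```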

Molter73 · 2026-02-25T12:09:36Z

README.md


```shell
-cargo test --config 'target."cfg(all())".runner="sudo -E" --features=bpf-test
+cargo test --config 'target."cfg(all())".runner="sudo -E"' --features=bpf-test
```


Thanks for this one!

Molter73 · 2026-02-25T13:50:47Z

tests/test_wildcard.py

+    txt_file = os.path.join(monitored_dir, 'document.txt')
+    with open(txt_file, 'w') as f:
+        f.write('This should be captured')
+
+    # Should not match any pattern
+    log_file = os.path.join(monitored_dir, 'app.log')
+    with open(log_file, 'w') as f:
+        f.write('This should be ignored')


You probably want these the other way around. The server.wait_events method does not wait for more events after it catches the last expected one, so you want to trigger anything that should be ignored before the things that should be caught.

And yes, I know this should be improved, probably by checking that there are no events left over in the server buffer during teardown or something.
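
A toy model shows why the order matters. The `wait_events` below is a hypothetical stand-in that only mimics the behavior described in this comment (it returns as soon as the last expected event arrives); it is not the test suite's real implementation, and whether the real one raises on unexpected events is an assumption.

```python
def wait_events(stream, expected):
    """Consume events until every expected one has been seen, then stop."""
    remaining = list(expected)
    if not remaining:
        return
    for event in stream:
        if event != remaining[0]:
            raise AssertionError(f'unexpected event: {event}')
        remaining.pop(0)
        if not remaining:
            return  # stops here; anything emitted later is never inspected
    raise AssertionError(f'missing events: {remaining}')


def detects_stray_event(stream):
    """Return True if the test would notice the wrongly reported file."""
    try:
        wait_events(iter(stream), ['document.txt'])
    except AssertionError:
        return True
    return False


# Both streams come from a buggy matcher that wrongly reports app.log.
captured_first = ['document.txt', 'app.log']
ignored_first = ['app.log', 'document.txt']

print(detects_stray_event(captured_first), detects_stray_event(ignored_first))
```

With the captured file written first, the stray `app.log` event arrives after the wait has returned and the bug goes unnoticed; with the ignored file written first, the same bug fails the test.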

Molter73 · 2026-02-25T14:10:32Z

tests/test_wildcard.py

+    return config, config_file
+
+
+def test_extension_wildcard(fact, wildcard_config, monitored_dir, server):


The fact fixture is configured as autouse, so you shouldn't need to explicitly ask for it in the tests.

Molter73 · 2026-02-25T14:11:08Z

tests/test_wildcard.py

Comments in this file apply to all tests in here.

Stringy added 2 commits February 23, 2026 14:09

Fmt

1c6a8e4

Molter73 reviewed Feb 23, 2026

PR review fixes

7946408

Stringy requested a review from Molter73 February 23, 2026 15:27

Molter73 reviewed Feb 23, 2026

Stringy added 3 commits February 24, 2026 12:45

Fix matching/tests and add wildcard tests

c65e6a5

Fix basic unit test

cf3a711

Fix missing single quote in README.md

d960eca

Stringy requested a review from Molter73 February 24, 2026 14:21

Stringy marked this pull request as ready for review February 24, 2026 14:21

Stringy requested review from a team and rhacs-bot as code owners February 24, 2026 14:21

Molter73 reviewed Feb 25, 2026

Stringy added 2 commits February 25, 2026 16:22

Fix tests based on PR comments

a316313

Use to_str instead of lossy

caaf860
